BUG: DatetimeIndex.insert doesnt preserve name and tz #7299

sinhrks · 2014-05-31T22:06:03Z

DatetimeIndex.insert doesn't preserve name and tz attributes. Also modified DatetimeIndex.asobject to return an object Index which has the same name as original to cover the case when the result is being object Index.

# normal Index preserves its name after insertion
idx = pd.Index([1, 2, 3], name='normal')
inserted = idx.insert(0, 1)
inserted.name
# normal

# But DatetimeIndex doesn't
dtidx = pd.DatetimeIndex([datetime.datetime(2011, 1, 3), datetime.datetime(2011, 1, 4)], freq='M', name='dtidx')
inserted = dtidx.insert(0, datetime.datetime(2011, 1, 5))
inserted.name
# None

jreback · 2014-06-01T16:40:27Z

did you test if adding a new date to a time-series with a given frequency then changes to None if the inferred different? (e.g. adding a 2nd day of the month to a MS or something). note that I think this check should be done before passing to the constructor as its more efficient (no need to really reinfer the freq). - note that this is NOT true for deleting (and already merged that)

sinhrks · 2014-06-02T13:45:51Z

@jreback Added a logic to check the item is on freq before inserting. Is it correct that inserting '2011-01-07' to Daily frequency index [2011-01-01, 2011-02, 2011-03'] is still Daily frequency? (Frequency is valid if all the values are on the offset, and unrelated to sorted / continuous).

And should set inferred_freq in case of the result's freq is None?

jreback · 2014-06-02T13:53:29Z

I am pretty sure that insert will almost always reset the freq to None (I think it reinfers inside the constructor).

just want to tests (and you example is not True, freq is None)

In [4]: i = pd.date_range('20130101',periods=3).insert(3,Timestamp('20130107'))

In [5]: i
Out[5]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-01, ..., 2013-01-07]
Length: 4, Freq: None, Timezone: None

In [6]: pd.infer_freq(i)

immerrr · 2014-06-02T14:31:28Z

Sorry, for barging in, I could've sworn I've seen code like this elsewhere. I've finally remembered: it was the subclassing guide. in numpy docs.

As long as Index is a subclass of ndarray (btw, are there plans to decouple Index from ndarray?), it seems a better way to overload __array_wrap__ to transfer metadata to new objects, otherwise you're going to need similar workarounds for a lot of functions in numpy.

jreback · 2014-06-02T14:36:07Z

hmm, @immerrr that is right....a better way (I thought that was already done....guess not)

yes, in theory plans to decouple Index from ndarray, but @jtratner not really actively working on this. Its not that hard actually. interested?

sinhrks · 2014-06-02T14:37:00Z

OK. Based on inferred_freq, I understand that values are all on the offset, continuous and sorted to be the specific frequency.

Because timezone is corrupted if values are directly passed to DatetimeIndex.__init__, I use _simple_new and set inferred_freq separately (like current union does). I'll check __array_wrap__ if it makes better implementation.

didx = pd.DatetimeIndex(['2000-01-31', '2000-02-29', '2000-03-31', '2000-04-30'])
didx.inferred_freq
# M

didx = pd.DatetimeIndex(['2000-01-31', '2000-03-31', '2000-02-29', '2000-04-30'])
didx.inferred_freq
# None

didx = pd.DatetimeIndex(['2000-01-31', '2000-01-31', '2000-02-29', '2000-03-31', '2000-04-30'])
didx.inferred_freq
# None

didx = pd.DatetimeIndex(['2000-01-31', '2000-03-31', '2000-04-30'])
didx.inferred_freq
# None

#7302 should also be fixed because delete may result in intermittent values?
Fortunately I think my other previous PR looks not affected...

immerrr · 2014-06-02T14:39:27Z

@jreback I'm now in a rather deadline-ish mode just skimming through mail in spare time. Once I'm done with that, I could take a stab at it.

jreback · 2014-06-02T15:28:50Z

@sinhrks yep I think that is correct; you basically DON't want to pass in a freq when you insert and let it reinfer (what I was saying is a) test this b) you can prob do a quick check if the inserted matches the same freq and would be faster so you CAN pass in a new freq).

also, I think delete should work the same (so need to fix that too).

maybe a function _maybe_infer or soething which you pass the kwargs and the new array

rosnfeld · 2014-06-02T21:27:51Z

Edit: moving my comment to #7302

sinhrks · 2014-06-03T11:42:13Z

@jreback. Thanks. But I think checking whether the constructed DTI meets the freq in advance is little difficult in some cases. For example:

# If target is on the edge, easy to check
pd.DatetimeIndex(['2011-01-01, '2011-01-02', '2011-01-03', '2011-01-04']).insert(4, pd.Timestamp('2011-01-05'))

# But following is difficult, calling infer_freq after construction is easier.
pd.DatetimeIndex(['2011-01-01, '2011-01-03', '2011-01-04']).insert(1, pd.Timestamp('2011-01-02'))

I've modified to call infer_freq after index construction, but is it different from your idea?

jreback · 2014-06-03T11:53:32Z

I think the constructor should infer it (I just don't remember exactly how/when this happens), and in fact you may not want to infer it (e.g. let it happen automatically if necessary). Which I think is the prior behavior. So maybe don't don't explicity infer at all (i was just thinking it would be faster/easier to do that).

jreback · 2014-06-03T12:09:37Z

@sinhrks I was thinking that checking the edge cases only for both insert/delete makes sense (for delete its easy as it would be the same freq), insert if the new stamp obeys the current freq. otherwise just pass thru freq=None which doesn't automatically infer (i think), also see @rosnfeld comment #7302 (which I think is related to the automatic inferring)

sinhrks · 2014-06-03T13:20:34Z

@jreback OK. I've modified, and hope to meet what you saying. I feel the check is very specific logic and not split to a separate function for now.

jreback · 2014-06-03T13:22:18Z

that looks right

jreback · 2014-06-03T13:22:49Z

@rosnfeld can you pull in this PR and see if you are talking about in #7302 is changed / same

sinhrks · 2014-06-03T13:33:09Z

Thanks. I understand @rosnfeld is pointing the difference caused by #7302 (Index.delete), not included in this PR (Index.insert). I've prepared the same fix for #7302 and issue separate PR.

jreback · 2014-06-03T13:36:07Z

ok, that's fine (but its actually the same issue),

rosnfeld · 2014-06-03T13:57:56Z

I don't see the code for this pull request changing anything about the delete behavior from what was already done in #7302, but #7320 breaks things, as noted on that PR. I can continue the conversation here or there, whichever you prefer.

jreback · 2014-06-05T16:01:53Z

@sinhrks pls fix this up similarly to .delete and I think good 2 go

sinhrks · 2014-06-06T11:32:22Z

@jreback Because .insert accepts only a single value, current logic should work.

numpy.insert can accepts list of indices/values as of v1.8, it is an option to change the API. To do this, normal Index should also be fixed.
http://docs.scipy.org/doc/numpy/reference/generated/numpy.insert.html

BUG: DatetimeIndex.insert doesnt preserve name and tz

jreback · 2014-06-06T12:58:56Z

this is fine for now...in the future could extend it; maybe add an issue for 0.15?

jreback added Bug labels Jun 1, 2014

jreback added this to the 0.14.1 milestone Jun 1, 2014

sinhrks mentioned this pull request Jun 2, 2014

BUG: inferred_freq results in None for eastern hemispheres timezones #7310

Closed

rosnfeld mentioned this pull request Jun 2, 2014

BUG: Index.delete doesnt preserve name and other attrs #7302

Merged

sinhrks mentioned this pull request Jun 3, 2014

BUG: Revisit Index.delete preserves freq #7320

Merged

BUG: DatetimeIndes.insert doesnt preserve name and tz

e22c856

jreback added a commit that referenced this pull request Jun 6, 2014

Merge pull request #7299 from sinhrks/insert

64d2160

BUG: DatetimeIndex.insert doesnt preserve name and tz

jreback merged commit 64d2160 into pandas-dev:master Jun 6, 2014

sinhrks deleted the insert branch June 6, 2014 14:05

jreback mentioned this pull request Jun 6, 2014

BUG: Datetime.index issues from 7299 #7378

Closed

sinhrks mentioned this pull request Jun 6, 2014

API: allow multiple indices in Index.insert #7384

Closed

sinhrks restored the insert branch June 6, 2014 22:32

sinhrks deleted the insert branch June 11, 2014 13:11

sinhrks mentioned this pull request Jun 29, 2014

BUG: DatetimeIndex.delete with tz raises ValueError #7608

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BUG: DatetimeIndex.insert doesnt preserve name and tz #7299

BUG: DatetimeIndex.insert doesnt preserve name and tz #7299

sinhrks commented May 31, 2014

jreback commented Jun 1, 2014

sinhrks commented Jun 2, 2014

jreback commented Jun 2, 2014

immerrr commented Jun 2, 2014

jreback commented Jun 2, 2014

sinhrks commented Jun 2, 2014

immerrr commented Jun 2, 2014

jreback commented Jun 2, 2014

rosnfeld commented Jun 2, 2014

sinhrks commented Jun 3, 2014

jreback commented Jun 3, 2014

jreback commented Jun 3, 2014

sinhrks commented Jun 3, 2014

jreback commented Jun 3, 2014

jreback commented Jun 3, 2014

sinhrks commented Jun 3, 2014

jreback commented Jun 3, 2014

rosnfeld commented Jun 3, 2014

jreback commented Jun 5, 2014

sinhrks commented Jun 6, 2014

jreback commented Jun 6, 2014

BUG: DatetimeIndex.insert doesnt preserve name and tz #7299

BUG: DatetimeIndex.insert doesnt preserve name and tz #7299

Conversation

sinhrks commented May 31, 2014

jreback commented Jun 1, 2014

sinhrks commented Jun 2, 2014

jreback commented Jun 2, 2014

immerrr commented Jun 2, 2014

jreback commented Jun 2, 2014

sinhrks commented Jun 2, 2014

immerrr commented Jun 2, 2014

jreback commented Jun 2, 2014

rosnfeld commented Jun 2, 2014

sinhrks commented Jun 3, 2014

jreback commented Jun 3, 2014

jreback commented Jun 3, 2014

sinhrks commented Jun 3, 2014

jreback commented Jun 3, 2014

jreback commented Jun 3, 2014

sinhrks commented Jun 3, 2014

jreback commented Jun 3, 2014

rosnfeld commented Jun 3, 2014

jreback commented Jun 5, 2014

sinhrks commented Jun 6, 2014

jreback commented Jun 6, 2014